[SDTEST-3751] Optimize split-aware parallelism planning by anmarchenko · Pull Request #60 · DataDog/ddtest

anmarchenko · 2026-05-19T15:03:29Z

What

DDTest now chooses the planned parallel runner count by evaluating several candidate test splits instead of linearly interpolating from saved/skippable time. Because duration statistics are now available, test execution can be planned more precisely.

The distribution algorithm plans multiple possible splits between minParallelism and maxParallelism and evaluates them based on (in order of priority):

expected duration of the slowest worker
the most even split
the lowest number of test runners needed

This PR also adds human-readable plan and run reports, printed to stderr by default, so users can understand and debug what DDTest decided. Reports summarize run identity, Datadog feature settings, backend data, planning results, split quality, and worker execution outcome. Reports can be disabled with:

DD_TEST_OPTIMIZATION_RUNNER_REPORT_ENABLED=false
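
A switch like this is typically read once at startup. The sketch below shows one plausible way to do that in Go; only the variable name comes from this PR. The helper name `reportsEnabled`, the use of `strconv.ParseBool`, and the choice to keep reports on for unset or unparseable values are assumptions, not confirmed DDTest behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const reportEnabledEnv = "DD_TEST_OPTIMIZATION_RUNNER_REPORT_ENABLED"

// reportsEnabled returns true unless the variable is set to a value that
// strconv.ParseBool reads as false (for example "false" or "0").
// lookup is injected so the function can be exercised without touching
// the real environment; pass os.LookupEnv in production code.
func reportsEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup(reportEnabledEnv)
	if !ok {
		return true // reports are printed by default
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true // assumed: unparseable values keep the default
	}
	return enabled
}

func main() {
	if reportsEnabled(os.LookupEnv) {
		// Reports go to stderr, as described above.
		fmt.Fprintln(os.Stderr, "+++ DDTest: plan report")
	}
}
```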

Expected planning report shape:

+++ DDTest: plan report

Run
  Service: checkout-api
  Repository: https://github.com/acme/checkout.git
  Commit: 9f3a1c7d2b4e
  Branch: feature/split-report
  Platform: ruby / rspec
  OS tags: os.platform=linux, os.architecture=amd64, os.version=6.8.0
  Runtime tags: runtime.name=ruby, runtime.version=3.3.4

Datadog
  Test Impact Analysis: enabled
    Test skipping: enabled
    Test impact collection: disabled
  Known tests: enabled
  Impacted tests: disabled
  Early flake detection: enabled
  Auto test retries: enabled
  Flaky test management: enabled

Backend data
  Known tests: 4 modules, 1,284 suites, 18,921 tests
  Skippable tests for this run: 312
  Managed flaky tests: 26 total, 8 quarantined, 3 disabled, 5 attempt-to-fix

Planning
  Test files discovered: 642
  Fully skipped files: 118
  Test files to run: 524
  Duration source: 431 known, 93 default
  Estimated time saved: 38.40%

Split
  Runners: 6
  Expected wall time: 4m12s
  Imbalance: 11s
  Total estimated runtime: 23m46s

Expected run report shape:

+++ DDTest: run report

Run
  Service: checkout-api
  Repository: https://github.com/acme/checkout.git
  Commit: 9f3a1c7d2b4e
  Branch: feature/split-report
  Platform: ruby / rspec
  OS tags: os.platform=linux, os.architecture=amd64, os.version=6.8.0
  Runtime tags: runtime.name=ruby, runtime.version=3.3.4

Execution
  Mode: CI node
  CI node: 2
  Local workers: 2
  Test files run: 87
  Duration: 3m58s
  Result: passed

Why

The previous runner-count calculation could over-provision workers for suites where additional runners did not reduce the slowest runner's estimated wall time. Scoring candidate splits by slowest-runner time, then imbalance, avoids launching workers that would sit idle.

The new reports make the planning decision easier to audit in CI logs. When a split looks surprising, users can see the Datadog settings, available backend data, duration-source coverage, estimated saved time, selected runner count, expected wall time, and imbalance without digging through artifacts or debug logs.

E2E testing

Run ddtest plan --platform ruby --framework rspec --min-parallelism 2 --max-parallelism 4 on a project with weighted runnable test files.
QA check: confirm stderr contains +++ DDTest: plan report.
QA check: confirm the planning report includes run info, Datadog feature settings, backend data, planning counts, duration-source counts, estimated saved time, and split metrics.
Inspect .testoptimization/runner/parallel-runners.txt and .testoptimization/runner/tests-split/runner-* to confirm the chosen runner count matches the fastest and most even estimated split.
Try scenarios with different test suite durations and check that the rule holds: the selected split has the lowest expected wall time and, among splits with that wall time, is as even as possible.
Run ddtest run --platform ruby --framework rspec using the generated plan artifacts.
QA check: confirm stderr contains +++ DDTest: run report.
QA check: confirm the run report includes run info, execution mode, CI node/local worker details when applicable, files run, duration, and pass/fail result.
Run plan and run again with DD_TEST_OPTIMIZATION_RUNNER_REPORT_ENABLED=false.
QA check: confirm neither report is printed while normal artifacts and test execution still work.
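
The artifact inspection in the steps above can be partly automated. The following sketch checks that every planned test file is assigned to exactly one runner. It assumes that each artifact lists one test file path per line and that the full list lives at `.testoptimization/runner/test-files.txt`; both are assumptions, and the helper names `readLines` and `checkSplit` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readLines returns the non-empty, trimmed lines of a file.
func readLines(path string) ([]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var lines []string
	for _, line := range strings.Split(string(data), "\n") {
		if line = strings.TrimSpace(line); line != "" {
			lines = append(lines, line)
		}
	}
	return lines, nil
}

// checkSplit verifies that each expected file is assigned to exactly one
// runner and that no runner lists a file outside the expected set.
func checkSplit(expected []string, runners map[string][]string) error {
	owner := make(map[string]string)
	for name, files := range runners {
		for _, f := range files {
			if prev, dup := owner[f]; dup {
				return fmt.Errorf("%s is listed twice (%s and %s)", f, prev, name)
			}
			owner[f] = name
		}
	}
	want := make(map[string]bool, len(expected))
	for _, f := range expected {
		want[f] = true
		if _, ok := owner[f]; !ok {
			return fmt.Errorf("%s is not assigned to any runner", f)
		}
	}
	for f, name := range owner {
		if !want[f] {
			return fmt.Errorf("%s in %s is not in the expected file list", f, name)
		}
	}
	return nil
}

func main() {
	dir := filepath.Join(".testoptimization", "runner")
	// Assumed location of the full file list.
	expected, err := readLines(filepath.Join(dir, "test-files.txt"))
	if err != nil {
		fmt.Println("skipping check:", err)
		return
	}
	paths, err := filepath.Glob(filepath.Join(dir, "tests-split", "runner-*"))
	if err != nil {
		fmt.Println("skipping check:", err)
		return
	}
	runners := make(map[string][]string)
	for _, p := range paths {
		files, err := readLines(p)
		if err != nil {
			fmt.Println("skipping check:", err)
			return
		}
		runners[filepath.Base(p)] = files
	}
	if err := checkSplit(expected, runners); err != nil {
		fmt.Println("split check failed:", err)
		return
	}
	fmt.Printf("ok: %d files across %d runners\n", len(expected), len(runners))
}
```

Run it from the project root after `ddtest plan`. If the artifacts are missing, it prints a note instead of failing.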

Validation performed: go test ./internal/runner, go test ./internal/runner -run '^$' -bench BenchmarkCalculateParallelRunners20000TestFiles -benchtime=1x, make test, and make lint.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 97267de802

anmarchenko · 2026-05-20T13:03:35Z

E2E Test Report: SUCCESS

Tested by: Shepherd Agent (autonomous QA for Datadog Test Optimization)

Test Environment

Method: Local Shepherd testing with mockdog backend
PR branch tested: anmarchenko/split-aware-parallelism
Revision tested: 2c23953327364b60d9cbf8c905c77a4e22fc6e13
Feature under test: split-aware parallelism planning plus human-readable ddtest plan and ddtest run reports

Results

| Check | Playground | Status | Evidence |
| --- | --- | --- | --- |
| Split-aware planning evaluates candidate runner counts and chooses the best split | `forem` / RSpec | PASS | Considered 2, 3, and 4 runners. Selected 3 runners with `expectedWallTime=10s`, `imbalance=3s`, `expectedTotalRuntime=25s`, `testFilesCount=23`. |
| Plan report is printed by default | `forem` / RSpec | PASS | Stderr contained `+++ DDTest: plan report` with run, Datadog settings, backend data, planning, and split sections. |
| Persisted plan artifacts match the selected split | `forem` / RSpec | PASS | `parallel-runners.txt = 3`; `test-files.txt = 23`; runner union matched `test-files.txt`; no duplicate runner assignments. |
| Run report is printed by default | `sidekiq` / Minitest | PASS | Stderr contained `+++ DDTest: run report`; result was `passed`; mockdog saw 52 tests in 1 session. |
| Reports can be disabled | `sidekiq` / Minitest | PASS | With `DD_TEST_OPTIMIZATION_RUNNER_REPORT_ENABLED=false`, no `+++ DDTest` report headers were printed while execution still succeeded. |

Test Methodology

Ran forem with the existing weighted mockdog scenario ddtest-suite-durations-forem-policies, scoped to spec/policies/*_spec.rb, using the PR branch via --dep ddtest=anmarchenko/split-aware-parallelism.
Verified the new split scorer selected the best candidate by checking debug logs and stashed .testoptimization/runner/* artifacts.
Ran sidekiq with ddtest run against test/api_test.rb to verify the run report path and actual test execution.
Re-ran sidekiq with DD_TEST_OPTIMIZATION_RUNNER_REPORT_ENABLED=false to verify reports are suppressed without breaking planning artifacts or execution.
Used mockdog reports as local backend verification. Datadog UI verification was not applicable because these runs targeted local mockdog rather than a real Datadog site.

Issues Found

No PR behavior issues found.

One local environment issue occurred before testing: rbenv was not on PATH, so the first crook dependency setup attempted to use macOS system Ruby and failed Bundler resolution. I reran the intended Shepherd workflow with Homebrew/rbenv on PATH; no workaround to ddtest behavior or playground semantics was used.

Example plan report

+++ DDTest: plan report

Run
  Service: forem
  Repository: git@github.com:anmarchenko/forem.git
  Commit: 97123b488f73b40a12fa6b1124d685352e8dc9fd
  Branch: main
  Platform: ruby / rspec
  OS tags: os.platform=darwin23, os.architecture=arm64, os.version=25.5.0
  Runtime tags: runtime.name=ruby, runtime.version=3.3.0

Datadog
  Test Impact Analysis: disabled
    Test skipping: disabled
    Test impact collection: disabled
  Known tests: disabled
  Impacted tests: disabled
  Early flake detection: disabled
  Auto test retries: disabled
  Flaky test management: disabled

Backend data
  Known tests: disabled
  Skippable tests for this run: disabled
  Managed flaky tests: disabled

Planning
  Test files discovered: 23
  Fully skipped files: 0
  Test files to run: 23
  Duration source: 23 known, 0 default
  Estimated time saved: 0.00%

Split
  Runners: 3
  Expected wall time: 10s
  Imbalance: 3s
  Total estimated runtime: 25s

Example run report

+++ DDTest: run report

Run
  Service: sidekiq
  Repository: git@github.com:anmarchenko/sidekiq.git
  Commit: 926d03e7ecda830b18e4868f57db6f7af3ce547f
  Branch: main
  Platform: ruby / minitest
  OS tags: os.platform=darwin23, os.architecture=arm64, os.version=25.5.0
  Runtime tags: runtime.name=ruby, runtime.version=3.3.5

Execution
  Mode: sequential
  Local workers: 1
  Test files run: 1
  Duration: 4.496s
  Result: passed

This E2E test was performed by Shepherd, autonomous QA for Datadog Test Optimization.

daniel-mohedano

🚀

Optimize split-aware parallelism planning

97267de

anmarchenko changed the title ~~Optimize split-aware parallelism planning~~ [SDTEST-3751] Optimize split-aware parallelism planning May 19, 2026

anmarchenko marked this pull request as ready for review May 19, 2026 15:04

anmarchenko requested a review from a team as a code owner May 19, 2026 15:04

chatgpt-codex-connector Bot reviewed May 19, 2026

Comment thread internal/runner/runner.go Outdated

anmarchenko added 10 commits May 19, 2026 21:47

Address split planning review feedback

0fc9fec

Remove optional split scheduling output

3554a32

Add split scheduling test coverage

7f933b1

Reorder split helpers by call surface

56170d1

Rename weighted runner split builder

5733656

Move distribution into split builder

76be219

Merge split helpers into distribution

77fa8da

Inline distribution helper wrappers

e933d72

Log selected parallel split metrics

37838c8

Add human-readable test run reports

2c23953

daniel-mohedano approved these changes May 20, 2026

anmarchenko merged commit adb588d into main May 20, 2026
3 checks passed

anmarchenko deleted the anmarchenko/split-aware-parallelism branch May 20, 2026 13:59

anmarchenko merged 11 commits into main from anmarchenko/split-aware-parallelism
